
feat: resolve ES/TS package specifiers for IMPORTS edges (#180)#184

Open
dLo999 wants to merge 3 commits into DeusData:main from dLo999:feat/es-ts-package-map


@dLo999 dLo999 commented Mar 29, 2026

Closes #180

Summary

Adds ES/TS package specifier resolution so that import { foo } from '@myorg/storage-utils' produces real IMPORTS edges. Previously, fqn_module() treated npm package specifiers as file paths, producing QNs that matched no graph nodes, so every package import yielded zero IMPORTS edges.

How it works

  1. Package map build (new Phase 1.5 in pipeline_run()): Walks the repo for package.json files, parses name + entry point (exports["."] → main → src/index.ts fallback). Builds a CBMHashTable mapping package names to resolved module QNs.

  2. cbm_pipeline_resolve_module(ctx, module_path): New wrapper called instead of fqn_module() at IMPORTS-edge creation sites. If the specifier is a package reference and found in the map, returns the mapped QN. Handles subpath imports (@myorg/pkg/utils) by resolving relative to the package directory. Falls through to fqn_module() for relative paths and unknown packages.

  3. Zero behavior change for non-JS repos: When no package.json files exist, pkg_map is NULL and all resolution falls through to fqn_module(). 0ms overhead.
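
The resolution order in steps 2 and 3 can be sketched roughly as follows. This is an illustration only, not the code in pass_pkgmap.c: toy_pkg_entry, toy_resolve_package() and the linear-scan "map" are hypothetical stand-ins for cbm_pkg_entry_t and the CBMHashTable, and the sketch returns plain paths rather than real module QNs. A NULL return stands for "fall through to fqn_module()".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one package-map entry. */
typedef struct {
    const char *name;     /* "name" from package.json */
    const char *pkg_dir;  /* directory that contains package.json */
    const char *entry_qn; /* resolved entry point (a real QN in the pipeline) */
} toy_pkg_entry;

/* Relative ("./x", "../x") and absolute ("/x") specifiers are never packages. */
static int is_package_specifier(const char *spec) {
    return spec && spec[0] != '\0' && spec[0] != '.' && spec[0] != '/';
}

/* Length of the package-name prefix: "@scope/name" when scoped, else "name". */
static size_t package_name_len(const char *spec) {
    const char *slash = strchr(spec, '/');
    if (spec[0] == '@' && slash) {
        slash = strchr(slash + 1, '/');
    }
    return slash ? (size_t)(slash - spec) : strlen(spec);
}

/* Heap-allocated concatenation of a and b, or NULL on allocation failure. */
static char *join2(const char *a, const char *b) {
    size_t need = strlen(a) + strlen(b) + 1;
    char *out = malloc(need);
    if (out) snprintf(out, need, "%s%s", a, b);
    return out;
}

/* Returns a malloc'd result, or NULL meaning "fall through to fqn_module()". */
char *toy_resolve_package(const toy_pkg_entry *map, size_t n, const char *spec) {
    if (!map || !is_package_specifier(spec)) return NULL;
    size_t len = package_name_len(spec);
    for (size_t i = 0; i < n; i++) {
        if (strlen(map[i].name) != len || strncmp(map[i].name, spec, len) != 0) {
            continue;
        }
        const char *subpath = spec + len; /* "" or "/helpers" */
        if (*subpath == '\0') return join2(map[i].entry_qn, "");
        return join2(map[i].pkg_dir, subpath); /* relative to the package dir */
    }
    return NULL; /* unknown package */
}
```

With a single entry {"@test/utils", "packages/utils", "packages/utils/src/index"}, the sketch maps "@test/utils" to the entry point and "@test/utils/helpers" to "packages/utils/helpers", while "react", "./utils" and a NULL map all fall through.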

Changes

  • src/pipeline/pass_pkgmap.c (new) — package map build, free, and resolve functions
  • src/pipeline/pipeline_internal.h — pkg_map field on ctx, cbm_pkg_entry_t struct, prototypes
  • src/pipeline/pipeline.c — Phase 1.5 build step, ctx initialization, cleanup
  • src/pipeline/pass_definitions.c:297 — resolve_module() instead of fqn_module()
  • src/pipeline/pass_calls.c:91 — same
  • src/pipeline/pass_usages.c:96 — same
  • src/pipeline/pass_semantic.c:81 — same
  • src/pipeline/pass_parallel.c:647 — same (in cbm_build_registry_from_cache)
  • Makefile.cbm — added pass_pkgmap.c and test_pkgmap.c
  • tests/test_pkgmap.c (new) — 20 unit tests

Test results

Build: Compiles cleanly on macOS (Apple Clang, arm64) with -Wall -Wextra -Werror

Test suite: 2761 passed, 0 failed (was 2741 — 20 new pkgmap tests)

Behavioral verification (monorepo matching issue scenario):

Created pnpm-like monorepo:

packages/storage-utils/package.json   {"name": "@myorg/storage-utils", "main": "src/index.ts"}
packages/storage-utils/src/index.ts   (exports validateDatasetName, buildSelectQuery)
apps/server/src/index.ts              (import { validateDatasetName } from '@myorg/storage-utils')
| Query | Before fix | After fix |
|---|---|---|
| MATCH (a)-[r:IMPORTS]->(b) RETURN a, b | 0 rows | 1 row: apps/server/src/__file__ → packages/storage-utils/src/index |
| MATCH (a)-[r:CALLS]->(b) RETURN a, b | 2 rows (via unique_name) | 2 rows (unchanged — no regression) |
| pkg_map phase timing | N/A | 0ms (negligible) |

Edge cases tested (unit tests):

| Test | Result |
|---|---|
| NULL pkg_map → falls through to fqn_module | PASS |
| Relative import ./utils → falls through | PASS |
| Scoped package @test/utils → exact match | PASS |
| Subpath @test/utils/helpers → resolves relative to pkg dir | PASS |
| Unknown package react → falls through | PASS |
| Package without "name" → skipped | PASS |
| Package without entry point → skipped | PASS |
| exports["."]["import"] conditional → resolved | PASS |
| No package.json in repo → NULL map, 0ms | PASS |
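
The "skipped" and "resolved" rows above follow from the entry-point precedence. A minimal sketch of that precedence, assuming the package.json fields have already been parsed: toy_manifest and toy_pick_entry_point() are hypothetical names, and treating the src/index.ts fallback as applying only when that file exists is an assumption made here to reconcile the fallback with the "Package without entry point → skipped" case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, already-parsed view of the package.json fields used here. */
typedef struct {
    const char *name;        /* "name", or NULL when absent */
    const char *exports_dot; /* exports["."] (or its "import" condition), or NULL */
    const char *main;        /* "main", or NULL when absent */
} toy_manifest;

/*
 * Returns the entry point to map the package name to, or NULL when the
 * package should be skipped. has_src_index_ts says whether src/index.ts
 * exists in the package directory (assumption: the fallback needs the file).
 */
const char *toy_pick_entry_point(const toy_manifest *m, int has_src_index_ts) {
    if (!m || !m->name) return NULL;          /* no "name": skip */
    if (m->exports_dot) return m->exports_dot; /* 1. exports["."] */
    if (m->main) return m->main;               /* 2. main */
    if (has_src_index_ts) return "src/index.ts"; /* 3. fallback */
    return NULL;                               /* no entry point: skip */
}
```

For the monorepo fixture above, {"name": "@myorg/storage-utils", "main": "src/index.ts"} has no exports field, so the sketch picks main.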

Scope boundary (not addressed)

  • Relative import resolution (./utils) — already works
  • tsconfig paths (@/components) — separate issue
  • Barrel re-export resolution — separate enhancement
  • External npm deps (in node_modules/) — outside project graph by design

Generated with agent-team via /issue

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

DeusData added the enhancement (New feature or request) and parsing/quality (Graph extraction bugs, false positives, missing edges) labels on Apr 3, 2026

DeusData commented Apr 7, 2026

Hey, thanks for publishing this PR. I will tackle this immediately after the 15th of April :)

…ith upstream

Conflicts resolved:
- Makefile.cbm: kept pass_pkgmap.c in PIPELINE_SRCS, added upstream's
  pass_similarity.c + pass_semantic_edges.c, kept new SIMHASH_SRCS /
  SEMANTIC_SRCS / UNIXCODER_BLOB_SRC sections; dropped stale httplink.c
  entry (removed upstream)
- pass_definitions.c: accepted upstream refactor to create_import_edges_for_file()
  helper; updated bare-specifier branch to use cbm_pipeline_resolve_module()
- pass_parallel.c: accepted upstream refactor to create_imports_edges() helper;
  updated bare-specifier branch to use cbm_pipeline_resolve_module()
- pass_usages.c: accepted upstream flat-loop cleanup (removed shadow variable
  re-declaration from PR); updated to use cbm_pipeline_resolve_module()
- pipeline.c: merged both .pkg_map = pkg_map (PR) and .mode = (int)p->mode
  (upstream) into ctx initializer; moved pkg_map declaration before first
  goto cleanup to fix -Wsometimes-uninitialized

@dLo999 dLo999 left a comment


Review Summary

Adds ES/TS package specifier resolution for monorepo IMPORTS edges (closes #180). New pass_pkgmap.c walks the repo, parses package.json files (name + entry point via exports["."]/main/fallback), and builds a CBMHashTable mapping package names to resolved module QNs. cbm_pipeline_resolve_module(ctx, path) wraps fqn_module() at 5 pass sites; falls through for relative/absolute/unknown paths. Zero overhead for non-JS repos (NULL map).

Verification status: branch merged with upstream/main cleanly; 2,740 tests pass unsandboxed (49 "failures" in an earlier sandboxed run were mkdtemp restrictions, not regressions).

Update (commit 4e183de): nit fixes applied. Fix 2 (duplicate-key comment clarification) landed. Fix 1 (NULL-arg ternary split) intentionally skipped — would have been a behavior change (callers free() the return unconditionally; cbm_pipeline_fqn_module already returns strdup("") for NULL project, so the combined guard is load-bearing). 2,740 tests still pass post-fix.

Findings

  • [nit] pass_pkgmap.c:229-241 — duplicate package-name handling: comment now explicitly documents that cbm_ht_set stores the supplied key pointer directly (slot->key = cur.key), so allocating a new key without freeing the old would orphan it. Logic unchanged; rationale now clear. ✅ addressed in 4e183de.
  • [nit] pass_pkgmap.c cbm_pipeline_resolve_module() NULL-arg ternary — intentionally preserved. The combined if (!ctx || !module_path) routing through cbm_pipeline_fqn_module(ctx ? ctx->project_name : NULL, module_path) is load-bearing because callers free() the return unconditionally and fqn_module handles NULL project gracefully via strdup(""). Splitting would be a behavior change.
  • [nit] tests/test_pkgmap.c — 20 new tests cover NULL inputs, scoped/bare packages, exports["."] resolution, subpath resolution, relative/absolute fallthrough, unknown packages, missing fields. Good coverage.
  • [question] CI is not wired for this PR — no build/test job reports visible. Local unsandboxed make -f Makefile.cbm test is clean, but upstream CI should confirm across the matrix.
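
To illustrate why the combined NULL guard was kept, here is a small sketch of the "always return an owned string" contract. toy_ctx, toy_fqn_module() and toy_resolve_module() are hypothetical stand-ins, and the "project.path" output format is invented for the example. The only behaviour taken from the discussion above is that the fallback returns a heap-allocated empty string for NULL input instead of NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pipeline context. */
typedef struct {
    const char *project_name;
    const void *pkg_map; /* NULL when the repo has no package.json */
} toy_ctx;

/*
 * Stand-in for fqn_module(): returns a heap string the caller owns.
 * NULL inputs yield an empty string rather than NULL (assumed for both
 * arguments in this sketch).
 */
char *toy_fqn_module(const char *project, const char *module_path) {
    const char *p = (project && module_path) ? project : "";
    const char *m = (project && module_path) ? module_path : "";
    const char *sep = (*p && *m) ? "." : "";
    size_t need = strlen(p) + strlen(sep) + strlen(m) + 1;
    char *out = malloc(need);
    if (out) snprintf(out, need, "%s%s%s", p, sep, m);
    return out;
}

/* Wrapper: both NULL cases share one guard and route through the fallback. */
char *toy_resolve_module(const toy_ctx *ctx, const char *module_path) {
    if (!ctx || !module_path) {
        return toy_fqn_module(ctx ? ctx->project_name : NULL, module_path);
    }
    /* A package-map lookup would go here; this sketch always falls through. */
    return toy_fqn_module(ctx->project_name, module_path);
}
```

Because every path goes through the fallback, a caller that reads the result before calling free() on it never sees NULL, short of an allocation failure. Returning NULL for a NULL module_path would change that contract.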

Draft review raised two CRITICAL findings — both verified false positives

A prior draft review (based on static diff inspection) flagged:

  1. "cbm_ht_get_key() is undefined" — refuted. Declared at src/foundation/hash_table.h:49, defined at src/foundation/hash_table.c:186, used across registry.c, graph_buffer.c, with dedicated tests in test_hash_table.c:187-205 (including NULL safety). The draft also cited the wrong line number (claimed line 304; actual call is line 239).

  2. "Use-after-free in duplicate key handling" — refuted by careful struct-ownership trace. cbm_ht_get(...) returns the VALUE (cbm_pkg_entry_t*). free(prev) frees the value struct only; the key is stored separately in slot->key and is not touched. cbm_ht_get_key() returns slot->key, which remains valid. The subsequent cbm_ht_set(pkg_map, existing_key, entry) matches the same key via strcmp, updates slot->value, and reassigns slot->key to the same pointer (hash_table.c:244). No key memory is freed or invalidated in this flow.

Both claims would have been caught by actually running the build/tests, which do in fact pass.
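
The ownership trace in item 2 can be reproduced on a toy table. The sketch below is not the real CBMHashTable: toy_table and the toy_* functions are hypothetical, and the only property borrowed from the discussion is that set stores the caller's key pointer directly instead of copying it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy table: set() stores the caller's key pointer, it does not copy it. */
typedef struct { const char *key; void *value; } toy_slot;
typedef struct { toy_slot slots[8]; size_t count; } toy_table;

static toy_slot *toy_find(toy_table *t, const char *key) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->slots[i].key, key) == 0) return &t->slots[i];
    }
    return NULL;
}

void *toy_get(toy_table *t, const char *key) {
    toy_slot *s = toy_find(t, key);
    return s ? s->value : NULL;
}

const char *toy_get_key(toy_table *t, const char *key) {
    toy_slot *s = toy_find(t, key);
    return s ? s->key : NULL;
}

int toy_set(toy_table *t, const char *key, void *value) {
    toy_slot *s = toy_find(t, key);
    if (!s) {
        if (t->count == 8) return -1;
        s = &t->slots[t->count++];
    }
    s->key = key; /* pointer stored as-is */
    s->value = value;
    return 0;
}

/* Insert or replace while keeping exactly one heap key per package name. */
int toy_put_package(toy_table *t, const char *name, void *entry) {
    void *prev = toy_get(t, name);
    if (prev) {
        const char *existing_key = toy_get_key(t, name);
        free(prev); /* frees the old value only; the key is untouched */
        return toy_set(t, existing_key, entry); /* same key pointer goes back */
    }
    size_t n = strlen(name) + 1;
    char *key = malloc(n);
    if (!key) return -1;
    memcpy(key, name, n);
    if (toy_set(t, key, entry) != 0) { free(key); return -1; }
    return 0;
}
```

Inserting the same package name twice leaves one slot whose key pointer is unchanged and whose value is the second entry. Allocating a fresh key in the duplicate branch would overwrite the stored pointer and leak the first key.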

CI Status

No CI checks reported on feat/es-ts-package-map. Local build + test (unsandboxed): clean, 2,740/2,740 pass (both pre- and post-fixup).

Verdict

APPROVE — semantically correct, consistent integration across 5 pass sites, good test coverage, zero-overhead fallthrough for non-JS repos. Recommend upstream CI wire-up as a separate concern.

Fix 2 only: the duplicate-package branch reuses the existing strdup'd key
in the hash-table slot rather than allocating a fresh one. Document why
this is intentional — cbm_ht_set stores the supplied pointer directly, so
allocating a new key without freeing the old one would be a leak.

Fix 1 (NULL-arg ternary split) was evaluated and skipped: splitting the
combined !ctx || !module_path guard would change the !module_path return
value from fqn_module(project_name, NULL) (a valid strdup'd string) to
NULL, which callers free() unconditionally — a regression risk.


Development

Successfully merging this pull request may close these issues.

ES/TS module specifiers produce zero IMPORTS edges — pipeline resolves by name only

2 participants